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1. Introduction 


Precon ditioned conjugate gradient (PCG)-methods have been a very popular and 
successful class of methods for solving large systems of equations arising from discretiza- 
tions of elliptic partial differential equations. With the advent of parallel computers in 
recent years, there has been increased research into effectively implementing these 
methods on various parallel architectures. In this paper, we present a class of precondi- 
tioners for elliptic problems built on ideas from the digital filtering theory and imple- 
mented on a multilevel grid structure. Our goal is to work towards preconditioners that 
are both highly parailelizable and rapidly convergent. 

The idea of preconditioning is a simple one but is now recognized as critical to the 
effectiveness of PCG methods. Suppose we would like to solve the symmetric positive 
definite linear system Ax=b, where A arises from discretizing a second-order self-adjoint 
elliptic partial differential operator. A good preconditioner for A is a matrix M that 
approximates A well (in the sense of producing a spectrum for the preconditioned system 
M~ l A clustering around 1 and having a small condition number) and for which the 
matrix vector product M~ l v for a given vector v can be computed efficiently. With such 
a preconditioner, one then solves in principle the preconditioned system M~ l Ax=M~ l b 
by the conjugate gradient method. 

Since an effective preconditioner plays a critical role in PCG methods, many classi- 
cal preconditioners have been proposed and studied, especially for second order elliptic 
problems. Among these are the Jacobi preconditioner (diagonal scaling), the SSOR 
preconditioner [3], the incomplete factorization preconditioners (ILU [25] and MILU [15]) 
and polynomial preconditioned [2], [19]. These preconditioners have been very successful, 
especially when implemented on sequential computers. 

In the parallel implementation of PCG methods, the major bottleneck is often the 



parallelization of the preconditioned since the rest of the PCG methods can be parallel- 
ized in a straightforward way. Unfortunately, previous works [12], [16] have shown that 
for many of the classical preconditioners, there is a fundamental tradeoff in the ease of 
parallelization and the rate of convergence. A principal obstacle to parallelization is the 
sequential manner in which many preconditioners use in traversing the computational 
grid — the data dependence implicitly prescribed by the method fundamentally limits 
the amount of parallelism available. Re-ordering the grid traversal (e.g. from natural to 
red-black ordering) or inventing new methods (e.g. polynomial preconditioners) to 
improve the parallelization alone invariably has an adverse effect on the rate of conver- 
gence [12], [23], 

The fundamental difficulty can be traced to the global dependence of elliptic prob- 
lems. An effective preconditioner must account for the global coupling inherent in the ori- 
ginal elliptic problem. Preconditioners that use purely local information (such as red- 
black orderings and polynomial preconditioners) are fundamentally limited in their abil- 
ity to improve the convergence rate. On the other hand, global coupling through a 
natural ordering grid traversal is not highly parailelizable. The fundamental challenge is 
therefore to construct effective global coupling that are highly parailelizable. Ideas along 
this line have of course been explored in the development of multigrid methods as solu- 
tion [10], [17] as well as preconditioning techniques [20], [21] and the more recently pro- 
posed hierarchical basis preconditioner [8], [29]. 

We are thus led to the consideration of preconditioners which share global informa- 
tion through a multilevel grid structure (ensuring a good convergence rate) but perform 
only local operations on each grid level (and hence highly parailelizable.) Since we are 
using the multilevel iteration within an outer conjugate gradient iteration, we have more 
flexibility in terms of the choice of inter- and intra-grid level operators, such as interpola- 
tion, projection and smoothing. One preconditioner of this type has been proposed 
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recently by Bramble, Pasciak and Xu [9] and Xu [28]. The methods that we propose in 
this paper are quite similar to their preconditioner and our digital filtering framework 
can be looked at as providing an alternative view of their method. It also allows the 
flexibility in deriving several variants. A major difference in the approach taken by 
Bramble, et al. and this paper from multigrid methods is that the smoothing operation in 
multigrid methods is replaced by a simple scaling operation. Other types of multilevel 
preconditioners have been studied by Vassilevski [27], Axelsson-Vassilevski [6], [7], 
Kuznetsov [24] and Axelsson [4]. 

The outline of the paper is as follows. In Section 2, we describe our framework for 
deriving multilevel filtering preconditioners for a model problem. The basic framework is 
then extended to more general problems in Section 3. In Section 4, we briefly survey 
several other preconditioners of the multilevel type. Numerical results for (model, vari- 
able coefficient and discontinuous coefficient) problems in 2D and 3D are presented in 
Section 5, comparing tbe performance of several multilevel preconditioners, including the 
usual multigrid method as a preconditioner, the hierarchical basis preconditioner and the 
method of Bramble-Pasciak-Xu. Some brief concluding remarks are given in Section 6. 

We note that the main emphasis of the present paper is on the convergence 
behavior of these multilevel preconditioners — no attempt is made to assess their paral- 
lel efficiency. That will be the subject of a forthcoming paper. 
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2. Multilevel filtering preconditioners: fundamentals 
2.1 Motivation 

Consider the one-dimensional discrete Poisson equation on [0,1] with zero boundary 
conditions on a uniform grid , 

( - jE + 1 - l£-> n - 1, ■ •• , JV-1 , (2.1) 

where N = h~ l = 2 L with integer L > 1 and E is the shift operator on H* . We denote 

the above system by 

A u - f , 

where A , u and / correspond to the discrete Laplacian, solution and forcing functions 
respectively. Clearly, A is a tridiagonal matrix with diagonal elements — 1 and — fc. 

It is well known that the matrix A can be diagonalized as 

A = W T A a W , (2.2) 

where A a is a diagonal matrix 

diag( Xj , • • • , X* , • • • , \ N _ t ) , X t *l- cos (kirh) , 
and W is an (N— l) 2 square matrix whose ikth row is 

v>k = (-j^)* ( sin(knh) , • • • , ain(kxnh) , ■ • • , sin{kx{N— l)h ) ) . (2.3) 

The diagonalization of the matrix A can be interpreted as the decomposition of the driv- 

ing and solution functions into their Fourier components, i.e. 

«* - E u n sin(lirnA) , }„ - {!-)* £f n ain{kxnh) , n - 0, 1, 2, • • • , N . 

;v »-l t* a-1 

One can easily verify that u* and f k are related via 

A{k) “ !k . * “ 1. 2, .... N — 1 , 

where 

A(k) = \ k = 1 — cos(kirh) , (2.4) - 

is known as the spectrum of the discrete Laplacian. 
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In order to invert A , we can make use of (2.2) and obtain 

A~' « W T \X l W . (2.5) 

The above procedure also serves as the general framework for fast Poisson solvers in 

higher dimensional cases. However, fast Poisson solvers are not generally applicable for 
nonseparable elliptic operators and irregular domains. Instead, we want to find good 
approximations to this solution procedure which are extensible to more genera] problems 
and then use them as preconditioners. The fundamental idea is to avoid the use of 
FFT’s but instead use a sequence of filtering operations to approximately achieve the 
desired spectral decomposition. This explains the motivation and the name of the mul- 
tilevel filtering (MF) preconditioner proposed in this paper. 

2.2 Piecewise constant approximation of the spectrum 

Our main idea for deriving the MF preconditioner for A is to divide all admissible 
wavenumbers into bands and to approximate the spectrum A{k) at each band with some 
constant. To be more precise, consider the following piecewise constant function in the 
wavenumber domain 

P{k) = c, , k£B, , 1 <1 <L , 

where 


B, - { k : 2'- 1 < k < 2‘ and k el), 

is the /th wavenumber band. Let Ap be the diagonal matrix with P(k) as the k th diag- 
onal element, i.e. 

A P « diag( P(l) , P{2) , • • , P{N- 1) ) , (2.6) 

and P “ W T Kp W . Then, the P-preconditioned Laplacian becomes 


P~ l A = W T A p ~i A W , 

where 


A 








V-i 




X 


^i 


cl 


)• 
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The question is how to choose appropriate C/’s to reduce the condition number k(P l A). 
Suppose that we can find c/’s so that 

Ci< — <C 2 , k e B t , 1 < / < L , 

c i 

where C 1 and C 2 are positive constants independent of A. Then, P and A are spectrally 
equivalent. There are many ways to achieve this goal. For example, we can choose any 
eigenvalue X within band B{ to be the constant c*. For the following discussion, let us 
consider the choice, 

c t = . (2.7) 

The ratio of A(k) and P(£) is then bounded by 

4 1 - 1 (l-cos(2- A+, - 1 jr)] <p-\k)A(k) < 4 L ~ l [l-cos(2- i+, jr)] , 
for k EB t . The largest and smallest values of P~ 1 (k)A(k) for k £B are bounded 

respectively 

W^) * max £ -1 (*)A(A) < m ( W \ L ~ l |l-cos(2“ i+, jr)] < -y- , 

and 

= min P -1 (A)y4(A) > [l-cos(2“ I+ ' _1 jr)J > 1 . 

Note that the last inequalities in above equations hold independent of L , or equivalently, 
the grid size A. Thus, the condition number k of the preconditioned operator P~ l A is 
bounded by a constant 

-2 

k(P~ 1 A )<-?-« 4.93. 

2 

We plot the spectra A (A), P~ l {k) and P~ l (k)A(k) in Figure 2.1 for N « A -1 - 256 with 
c r defined in (2.7). 

2*3 Decomposition and synthesis based on filtering 

The preconditioning procedure 
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P~ x r = W T Kp l Wr , (2.8) 

consists of three building blocks: decomposition (WQ, scaling (Ap 1 ) and synthesis {W T ). 

Let us rewrite (2.8) as 

P-'r «( h — WfW, )r , 
i-1 C l 

where W h 1 </ <Z,, are (TV — l) 2 square matrices which have the same 2 ,_1 to 2 l — 1 
rows as W and zero vectors for remaining rows. Consequently, we have 


W, T W,w k = 


w k 

0 , 


k<EB, 

otherwise 


(2.9) 


where w k is defined in (2.3). From (2.9), we see clearly that W t functions as an ideal 
bandpass filter for band B t . Although it is possible to implement the ideal bandpass 
characteristics (2.9) with FFT or bandpass filters of size 0{N), the corresponding imple- 
mentation is either not easily generalizable or simply too expensive ( 0(N ‘ ) complexity ). 
Instead, we want to approximate the ideal bandpass filter W t with nonideal bandpass 
filters F l 


f - 
1 0 , 


... * € B t 

Ft Ft w k ~ I n otherwise ’ 

in such a way that F t can be implemented cost-effectively for general problems. Note 
that F t is in general a dense matrix of size (N— l) 2 . The resulting preconditioner is in 
form 


M~ x r =( t—F, T Fi ) r . (2.10) 

/-l c i 

Before the detailed discussion of implementing F t , 1 </<L, with digital filters, it is 
worthwhile to summarize the similarities and differences between the fast Poisson solver 
(2.5) and the MF preconditioning (2.10). They are both based on the spectral decomposi- 
tion idea. The fast Poisson solver decomposes a function into its Fourier components 
through FFT while the MF preconditioner decomposes it approximately into a certain 
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number of bands through filtering. The filtering operations, which correspond to local 
averaging processes, can be easily adapted to irregular grids and domains and variable 
coefficients. In contrast, FFT is primarily applicable to constant coefficient problems 
with regular grids and domains. Besides, for the fast Poisson solver we usually require 
the detailed knowledge of the spectrum. But for the MF preconditioner we only have to 
estimate how the spectrum varies from one band to another. 

In the context of multirate signal processing[l3], the separation of a function into 
several components each of which is confined in a narrow wavenumber band is known as 
the filter bank analyzer and the reverse process is the filter bank synthesizer. Although 
there exist many ways to implement the filter bank analyzer ( F t , 1 </<£,) and syn- 
thesizer (F t , l</<£), a simple design illustrated by the block diagram of Figure 2.2 will 
be sufficient for our purpose. This design is based on the cascade of a sequence of ele- 
mentary filters H L , Hi_ u • • • , where the function of H t is to preserve Fourier com- 
ponents contained in bands B\, • • • , and to eliminate Fourier components con- 
tained in band Bj. From Figure 2.2, we see that F{ are related to elementary filters H t 
via 


f l =i-h l , 

F,-(I -H,) [ n H, 1, 2 < / < L—l , 

4+1 

L 


(2.11a) 

(2.11b) 


F x - H, . (2.11c) 

It is not hard to check that we can obtain components in bands B L and B x from the 
outputs of F l and F x . The product of a sequence of elementary filters appearing in 
(2.11b) leads to the band BjU • ■ • U B t , from which the band B t can be separated by 
using the filter I — H t . Thus, the problem of designing the filter bank F t , 1 </<£., is 
transformed into an equivalent one of designing elementary filters H t , . 
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2.4 Design of elementary filters 

Consider the design of the elementary filter Hi appearing at the first stage. It is 
desired that the filter Hi = W T W has the following ideal lowpass characteristic, 


* h, 0<*<2^ 

L\ ) }o , 2 i_1 < k <2 l ' 

where fti{k) is the Hh element of the diagonal matrix K Hl so that we are able to 
separate the function r into two bands: the high wavenumber band ( I—H L )r and the 


low wavenumber band H L r. 

We will approximate the above ideal filter with a nonideal lowpass filter of size 

( 2 / + 1 ), 


J . 

Hl,j — a o + £ a j ( E 1 + E~ 3 ) . (2.12) 

i - 1 

where the coefficients a 0 and a ; ’s are to be determined. In order to define the operation 

J 

H L,J v n - do + E a ) ( *>!>+/ + v n-j ) 

J-l 

for any vector v n appropriately, the odd-periodic extension of v n is assumed, 


v_„ = — v„ and v n+2p Af = v„ , for integer p . 

Consequently, the filter H L J corresponds to a circulant matrix. The above odd-periodic 

assumption is only used for analyzing and designing filters. The actual implementation 

of the MGMF algorithm (see Section 3.5) does not rely on this assumption. 

There are numerous ways to determine the coefficients <>q and ’s depending what 
approximation criteria to be used. The operator Hi j has the eigenfunction s\n{kirnh ) 
with the eigenvalue 


L,j( k ) = a 0 + 2 £ a j cos(knjh) . 

Here we consider a class of lowpass filters based on the following two criteria: 
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1 


(1) til Ay) “y #!,/(*) -J = - ], 

(2) — 1 and the first / th derivatives (1 < j < J) of i^t t /(0) are all zero. 

The first criterion implies that the function fii t j(k) ~ ~ is odd symmetric with respect 

N 

to * * — . A direct consequence of this criterion is that 


<*o = y and oy = 0 , j positive even . 

The second criterion, called the maximally flat criterion [18], requires the approximation at 
the origin to be as accurate as possible. It is used to determine a ; with odd j. In Table 
2.1, we list coefficients for / = 1, 3, 5 obtained according to criteria (1) and (2) and 
plot their spectra in Figure 2.3 with N ** 2 s = 256. The larger J becomes, the better the 
approximation is. 


J 

«0 

«i 


«6 

1 

1 

1 

0 

0 


2 

4 



3 

1 

9 

-1 

0 


2 

32 

32 


5 

1 

150 

-25 

3 

V 

2 

512 

512 

512 


Table 2.1: Coefficients of a class of nonideal lowpass filters 

As illustrated in Figure 2.2, the low wavenumber band of the function r is used as 
the input to the filter at the next stage. The filter Hi-\ can be constructed with 
the same set of coefficients used by Hi, i.e. 

Hl-u - «o + E «,• ( + r* ) . (2.13) 

i-l 

Comparing (2.12) and (2.13), we see that the only difference between H LJ and H L-l } is 
the position of grid points used for averaging. For the first-stage filter Hi Jt local 




- 11 - 


averaging is used. For the second-stage filter we consider averaging between 

points separated by 2h. This design is due to the following reason. From (2.13), we see 
that the filter Hi_ x j has the spectrum 

^L-i,A k ) ~ a o + 2 £) a j cos{krrj2h ) , 
i - 1 

and that is related to ^i t j{k) via 

&L-iAk) = &iA 2k) - 

Consequently, for functions consisting only, of components in low wavenumber region 
l<k Hl-\ behaves like a lowpass filter which preserves components in the region 

\<k <2 l ~ 2 and filters out components in the region 2 L ~ 2 <k <2 L ~ l . However, note that 
Hi, l <L is not a lowpass filter with respect to the entire wavenumber band. 

By applying the same procedure recursively, we can define the general elementary 
filter H t on a uniform infinite grid as 

H,,j - «o +£«,-( + E ) , 2 < / < L , (2.14) 

where the coefficients a ; ’s are listed in Table 2.1. The spectrum of H t J is 

« ao + E a j cos t(kvj2 L ~ l h) , 2 < l < L . (2.15) 

J m i 

It is clear from (2.14) that the elementary filter Hij is symmetric. So is the bandpass 
filter F[. The construction of the bandpass filter F { with elementary filters Hi is illus- 
trated in Figure 2.4, where /—I- 1 and /■* 1 are chosen as example. We know from 
(2.11) that /■*_,-( / -H l _ x )H l . 

The above discussion is based on the odd-periodic property of the sequence v„. 
However, this may not be easily implementable for general multidimensional problems. 
The difficulty arises when the size of H t> j is so large that it operates on points outside 
the domain. There are two possible solutions. It may be preferable to construct filters of 
larger size by the repeated application of filters of smaller size. For example, we can 
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apply the filter Hi j (2.12) with J ** 1 twice. This is equivalent to a filter of size 5, 

- ( i*- + i + \e f - ±E-' + + } + \e + ■ 

Another possibility is to apply smaller filters at points close to boundaries and larger 
filters at points far away from boundaries. 

Note also that, for fixed J, the size of the elementary filter H( j increases as / 
decreases. However, this problem can be resolved by incorporating the multigrid discreti- 
zation structure into the above multilevel filtering framework as described in Section 3.1. 

2.5 Fourier Analysis and higher dimensional cases 

Since the preconditioner A/ -1 = and the Laplacian A share the same 

i 

eigenvectors, i.e. Fourier sine functions, the spectrum and condition number of the MF- 
preconditioned Laplacian can be analyzed conveniently by Fourier analysis. From (2.11), 
we know the following spectral relationship 


h.A*) - 1 - At .,(*) , 

KAV - ( I - d,j(k) ) I h fi'jk) ] , s < / < l - 1 , ( 2 . 16 ) 

A./(*) - n D rJ (k) , 

r-2 

where 1</<Z,, are given by (2.15). Using (2.4), (2.7) and (2.16), we can deter- 

mine the spectrum of M~*A , 


HM-'A)- lir\k]A(k) - i:—f, T j(k)t-,j(k)A(k) . 

/-I c l 

The spectrum X(A/ -1 A ) is plotted as function of k with J = 1, 3, 5 and A -1 « 256 in 
Figure 2.5. We should compare these spectra with that in Figure 2.1 based on the ideal 
filtering assumption. All of them have one common feature. That is, eigenvalues are 
redistributed in such a way that there exist many local maxima and minima. The condi- 
tion numbers for J * 1, 3, 5 are 2.50, 1.88 and 1.93 respectively. Note that these 
numbers are in fact smaller than the condition number 4.93 obtained with ideal filtering. 
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The generalization of the MF preconditioner to two- or three-dimensional problems 
on square or cube domains can be done straightforwardly. For example, we may con- 
struct the two-dimensional elementary filter by the tensor product of one-dimensional 
elementary filters along the x- and y-directions, 



which can further be simplified by using operator algebra[l4j. For example, the 
coefficients for i can be written in stencil form as 


Ht , . 


16 


1 2 1 
2 4 2. 
1 2 1 


(2.17) 


Similarly, the three-dimensional elementary filter can be obtained by the tensor product 
of three one-dimensional filters along the x-, y- and ^-directions. 


The condition numbers of one-, two- and three-dimensional MF-preconditioned 
Laplacians with two types of nonideal filters (7*1 and 7*3) are computed and plot- 
ted as function of the grid size h in Figures 2.6 (a) and (b). These figures show that M 
and A are spectrally equivalent. 
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3. Multilevel filtering preconditionera: generalizations 

In Section 2, we discuss the construction of the MF preconditioner for the model 
Poisson problem based on a single discretization grid. This section will discuss the gen- 
eralization of this preconditioning technique so that it can be implemented more 
efficiently and applied to more general self-adjoint elliptic PDE problems. 

3.1 Multigrid multilevel filtering (MGMF) 

The filtering operation described above is performed at every grid points at all lev- 
els 2</<£. Since there are O(logiV) levels and O(JN) operations per level, where N 
and J denote the order of unknowns and the filter size respectively, the total number of 
operations required is proportional to 0(JN\ogN). However, since waveforms consisting 
only of low wavenumber components can be well represented on coarser grids, we can use 
the multigrid philosophy[l0],[l7] and incorporate the multigrid discretization structure 
into the filtering framework described in Section 2. That is, we construct a sequence of 
grids Cl t of sizes h t —0( 2~‘), 1 </<!,, to represent the decomposed components. Then, 
the total number of unknowns is O(N) and consequently the total number of operations 
per MF preconditioning step is O(JN). Note that / is a constant independent of N. 

The corresponding block diagram is depicted in Figure 3.1. The preconditioners 
illustrated in Figures 2.2 and 3.1 are called the SGMF and MGMF preconditioners 
respectively. Note that the MGMF preconditioner is obtained by inserting down- 
sampling (//“*) and up-sampling (7/_j ) operators into the SGMF preconditioner. With 
the notation commonly used in the multigrid literatures, the down-sampling and up- 
sampling operators for grids ft, {h^^h) and (A=2 X ’“ ,+I ) can be defined as 


0 0 0 

/-I 

0 0 0 

0 1 0 

1/-1 : 

0 10 

0 0 0 

l 

0 0 0 


It is easy to verify that a lowpass filter followed by a down-sampling operator is the 
same as the restriction operator in MG methods while a upsampling operator followed by 
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a low pass filter is equivalent to the interpolation operator [22]. 


3.2 Lowpaaa versus bandpass filtering 

To save computational work, we can further simplify the MGMF preconditioner in 
Figure 3.1 by deleting the paths and the associated work corresponding to I — H { . As 
given in Figure 3.2, we have the modified MGMF preconditioner 


«■*>■ = ( £-j-C, r C, ) r , (3.1) 

i-l d t 

where 

G l - / , 

L 

G, - n +i /; _1 H, , for 2 < / < L -1 , 

G\ - h 2 n I>- 1 H, . 

Note that the bandpass filters F 1 / in the MF preconditioner A/ have been replaced by the 
lowpass filters G t in the MF preconditioner Q. By choosing d/’s appropriately, we can 
make Q behaves very similarly to M as described below. With the MF preconditioner 
implemented by (3.1), Fourier components of band B t exist in the first L—l+1 levels and 
these components are multiplied by d£"\ • • ■ , d/ -1 respectively. Therefore, the scaling 
constants d/’s are implicitly defined via 



Solving (3.2) for d/ gives: 


(3-2) 


4 "ft “d di - , l -L - 1 , • • • , 1 . (3.3) 

C l ~ C I + 1 

However, we find from numerical experiments that the parameter sets {c/} and {d/} used 
in Figure 3.2 give about the same convergence rate. This can be explained by the obser- 
vation that, for small /, d, « c/ since C/ -1 » c/+j . 
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Tbe preconditioning Q~ l r can be viewed as a degenerate inultigrid method, for 
which we have a sequence of restriction and interpolation operations but where the error 
smoothing at each grid level is replaced by an appropriate scaling. 

3.3 Discretization with nonunifonn grids 

The above observation leads us to generalize the MF preconditioner to the case of 
nonuniform grids commonly obtained from the finite-element discretization. That is, one 
can view projection as decomposition and interpolation as synthesis and any multigrid 
method can be used as an MGMF preconditioner if we replace the potentially more 
expensive error smoothing by a simple scaling. It is well known that the eigenvalue in 
band B t (see Section 2.2) behaves like 0(h~ 2 ), where h t describes approximately the grid 
spacing for level / [9], Therefore, a general rule for selecting the scaling constant c t at 
grid level l is 

c/ = 0(A/ -2 ) . 

This generalized version is closely related to the preconditioner by Bramble, Pasciak and 
Xu [9). They derived their preconditioner in the finite-element context discretized with 
the nested triangular elements. From our filtering framework, the corresponding elemen- 
tary filter H L takes the form 

Oil 

Hljpx '• 4" 12 1. (3.4) 

8 110 

which is different from l given by (2.17). We can derive other filters from (3.4) by 

* • 

applying it more than once. For example, by applying it twice, we get 

0 0 12 1 
0 2 6 6 2 
Hl.TBPX ' "TT 1 6 10 6 1 
2 6 6 2 0 
12 10 0 


(3.5) 



- 17- 


In order to eliminate the directional preference, we can apply (3.4) in alternating direc- 
tions which gives a symmetric filter: 


3.4 Diagonal scaling 


Hl,adbpx • 


16 


0 12 10 

1 4 6 4 1 

2 6 8 6 2 

1 4 6 4 1 

0 12 10 


(3.6) 


The MF preconditioner is designed to capture the spectral property (or h- 
dependency) of a discretized elliptic operator but not the variation of its coefficients. 
This is also true for the hierarchical basis and BPX preconditioners. In order to take 
badly scaled variable coefficients into account, we use the MF preconditioner in associa- 
tion with diagonal scaling in our experiments[16]. The diagonal scaling is often used for 
cases where the diagonal elements of the coefficient matrix A vary for a wide range. 
Suppose that the coefficient matrix can be written as 


A - D*AD* , 

where we choose D to be a diagonal matrix with positive elements in such a way that 
the diagonal elements of A are of the same order, say, 0(1). Then, in order to solve 
A u —f, we can solve an equivalent problem A ii «/, where u «= D*u and 
/ ■= D~*J , with the MF preconditioner. There exist other ways to incorporate the 
coefficient information into preconditioners of the multilevel type, say, to use the Gauss- 
Seidel smoothing suggested by Bank et al.[8]. 

3.5 Summary of practical MGMF algorithm 

Given a sequence of grids Cl t , 1 </<L, down-sampling (// +1 ) and up-sampling (l[ +l ) 
operators between grids Cl t and n (+ll and appropriate elementary filters H t defined on 
flj, the algorithm corresponding to the block diagram given by Figure 3.2 can be sum- 
marized as follows, 
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Decom position: 

V L r, 

for l = L -l , • • ■ ,2 
v i ;a * iLxHt+xv^, 
t>! :«= /f 2 V2, 

Scaling: 

for / = L , • • • ,1 

tt>i :■» 

Synthesis: 

:= w 2 + H 2 w lf 
for / = 2 , • • • , L 

w i + 

Q~ l r s L 


Table 3.1 Computation of (? _1 r 


This is the MGMF algorithm implemented in Section 5. 
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4. Brief survey of multilevel preconditioners 

Id this section, we very briefly survey other multilevel preconditioners that have 
been proposed in the literature and their relationships to one another. 

4.1 Multigrid preconditioner (MG) 

A natural choice for a multilevel preconditioner is to use a fix number of cycles of a 
conventional multigrid method. This approach has been explored early on in the 
development of multigrid methods [20], [21]. The basic operations on each grid are inter- 
polation, projection and smoothing operations, each of which can be easily designed to be 
highly parallelizable. For example, in the V-cycle strategy, each grid is visited exactly 
twice in each preconditioning step, once going from fine to coarse grids and once coming 
back from coarse to fine. However, for highly irregular problems, such as singularities in 
the solutions due to re-entrant corners and highly discontinous coefficients, it is not clear 
how to choose the smoothing operations and the performance can deteriorate. 

4.2 Hierarchical basis preconditioner (HB) 

Another preconditioning technique of multilevel type is the hierarchical basis 
method [8], [29]. The name refers to the space of hierarchical basis functions defined on 
the grid hierarchy. The usual nodal basis functions are used except that those defined at 
grid points on a given level which also belong to coarser levels are omitted. Let the 
hierarchical basis functions be denoted by il>j, where l denotes the grid level and j the 
index of the basis function on that level. Then the action of the inverse of the hierarchi- 
cal basis preconditioner M on a vector v can be written as: 

M~ l v - = SS T v, 

t i 

and can be computed by a V-cycle with the matrix S T corresponds to a fine to coarse 
grid traversal and S a coarse to fine traversal. On each level, only local operations are 
performed. In 2D, the condition number of the preconditioned system can be showned io 
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grow like ©(log 2 /* -1 ), which is a very slow growth. Unfortunately, this nice property is 
lost in 3D where the growth can be proved to be 0(A -1 )[26],[29j. However, these 
theoretical results are proven under much weaker regularity assumptions than for the 
multigrid methods and the computational work per step is 0(A -1 ) even for highly 
nonuniform and refined meshes. For numerical experiments on parallel computers, see 
[1].[16]. 

4.3 Method by Bramble-Pasciak-Xu (BPX) 

Very recently, Bramble-Pasciak-Xu [9], [28] proposed the following preconditioner for 
second -dr der elliptic problems in R* \ 

m-'v = EV'IX •’,*/)#/ , 

< } 

where are the nodal basis functions and h t is measure of the mesh size of grid level /. 
Since the form of their preconditioner is very similar to that for the hierarchical basis 
method, the computations can be arranged in a similar way via a V-cycle. They proved 
that the condition number of the preconditioned system can be bounded by 0(logA _1 ) 
for problems with smooth solutions, by Oflog^” 1 ) for problems with crack type singu- 
larities and by ^(log 3 ^ -1 ) for problems with discontinous coefficients. In 3D, this is a 
significant improvement over the hierarchical basis method. 

4.4 Algebraic multilevel preconditioners (AMP) 

Vassilevski[27j proposed a different approach to derive multilevel preconditioners. 
He used the standard nodal basis functions and a multilevel ordering of the nodes of the 
discretization, in which nodes at a given level belonging to a coarser grid are ordered 
after the other nodes. He then considered an approximate block factorization of the 
stiffness matrix in this ordering, in which the successive Schur complements at a given 
grid level are approximated by iteration with the preconditioner of the stiffness matrix 
recursively defined at the current level. He showed that, with one iteration at each level, 
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the condition number of the preconditioned system can be bounded by 0(logh -1 ). A 
similar method has also been proposed by Kuznetsov [24). Later, Axelsson-Vassilevski 
[6],[7J improved this bound to 0(1) but carrying out recursively more (Chebychev) itera- 
tions with the preconditioner at each level. Axelsson [4] also showed that the same tech- 
nique can be applied when hierarchical basis functions are used instead of the nodal 
basis. Note that when the number of iterations at each level exceeds 1, the grid traversal 
differs from all the previously mentioned V-cycle based methods. At this time, we have 
not included non-V-cycle type preconditioners in our numerical comparisons but plan to 
do so in the future. 

4.5 Relationship among multilevel preconditioners 

As can be seen from the presentation above, various multilevel preconditioners share 
some similarities. Mo6t of the multilevel preconditioners are in the form of a multigrid 
V-cycle (MG, HB, BPX and MF) except AMG methods. The MF preconditioner is very 
similar to the BPX method. The MF method allows some flexibility in the choice of 
filters (basically any multigrid residual averaging operators can be used) and does not 
depend on the use of a finite element discretization with nested nodal basis functions. It 
also allows a single grid (i.e. non-multigrid) version which may better suit massively 
parallel architecture computers. On the other hand, the finite element framework allows 
an elegant proof of the asymptotic convergence behavior for rather general problems as is 
done in (9), [28] whereas the filtering framework is rigorously provable for constant 
coefficient model problems only (although much more detailed information can be 
obtained for them.) 

Finally, it is interesting to compare these preconditioners with the conventional 
multigrid method as an iterative method (instead of as a preconditioner). Several of the 
preconditioners have the form of a conventional multigrid cycle, except that the smooth- 
ing operations are omitted. For less regular problems where a good smoothing operator 



) 


- 22 - 


b hard to derive and could be quite expensive, one step of these preconditioners can be 
substantially less expensive that a corresponding step of the multigrid iteration. In a 
sense, one can view these preconditioners as efficiently capturing mesh size dependent 
part of the ill-conditioning of the elliptic operator and leaves the other sources of ill- 
conditioning (e.g. discontinuous coefficients) to the conjugate gradient iteration. The 
combination of multigrid and conjugate gradient holds the promise of being both robust 
and efficient. However, it seems that to get a spectrally equivalent preconditioner, one 
has to go beyond the V-'cycle and perform more iterations on each grid as in the AMP 


methods. 



) 
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5. Numerical experiments 

In this section, we present numerical results for two- and three-dimensional test 
problems to compare the convergence behavior and the amount of work needed for vari- 
ous preconditioners discussed previously. The preconditioners implemented are: 


HB: hierarchical basis preconditioner using linear elements for 2D and trilinear ele- 

ments for 3D problems, 

MG(i,i): multigrid preconditioner with one V-cycle, where i is the number of pre- and 

post-smoothings, 

BPXl: the BPX preconditioner for 2D problems [Hi given by (3.4)), 

BPX2: a modified version of BPX preconditioner by filtering twice for 2D problems 

[H l given by (3.5)), 


BPX3: another modified version of BPX preconditioner by filtering twice but using 

linear elements of different orientations for 2D problems [H L given by (3.6)), 

MGMFl: the MGMF preconditioner with the 9-point (2.15) or 27-point filter for 2D 

and 3D problems respectively, 

MGMF2: a modified version of MGMF preconditioner in which the 9-point (or 27-point) 

filter is applied twice, 


MGMF3: another modified version of MGMF preconditjoner in which the 9-point (or 

27-point) filter is applied once at the finest grid level (to have a smaller 
amount of work compared to MGMF2) and twice at other grid levels (to 
achieve a faster convergence rate compared to MGMFl), 


RIC: 


the relaxed incomplete Cholesky preconditioner [5] is included for comparison 

purpose. For the relaxation factor, we use the optimal value w = 1 — 8sin 2 -^- 

from [111. The number of iterations required for RIC can be bounded by 
O(n'). 


The operation counts per iteration (just for preconditioning) for each method for 2D 
and 3D problems are given in Tables 5.1 and 5.2 respectively. The operation counts 
include addition, multiplication and division (each is counted as one operation) and does 
not include other overhead operations such as condition checking or data copying. The 
operations required in each CG step for 2D and 3D problems are 21 N and 25 N 
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respectively. 


JVeconditioner 

Operation count Der iteration 

RIC 

9 N 

HB 

7 N 

MGfl.ll 

38 N 

BPXl 

8 N 

BPX2 

26 N 

BPX3 

26 N 

MGMFl 

9N 

_ MGMF2 

27 N 

MGMF3 

IU V 


Table 5.1: Work per iteration for preconditioners (2D) 


_ Preconditioner 

Operation count per iteration 

RIC 

13 N 

HB 

SN 

MGMFl (BPXl! 

9 N 

-MGMF2 (BPX2! 

32 N 

MGMF3 

L2JV 


Table 5.2: Work per iteration for preconditioners (3D) 


From Table 5.1, we observe that the operation counts per iteration for BPXl and 
MGMFl are much less than that of the MG(l,l) preconditioners because the former 
preconditioners do not need smoothing which is expensive. In general, for 2D problems, 
MG(i,i) preconditioner takes (38 + 32X(t — 1)) N operations. For example, MG(3,3) 
preconditioning requires 102 N operations. Also, note that the application of filtering 
twice requires about three times the work of filtering once. This is because by filtering 
twice the filter stencil is extended from 9- point to 25-point (about three times as many 
points). 

For 3D problems, the BPXl (BPX2) preconditioning using trilinear elements is same 
as the MGMFl (MGMF2) preconditioning as shown in Table 5.2. The MG precondi- 
tioner has not yet been implemented for 3D problems. 

For all test problems, we use the standard 5- (or 7-) point stencil on a square (or 
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cubic) uniform mesh with A = n _1 and N = (n — l) 2 (or N = (n —l) 3 ), zero boundary 
conditions and zero initial guesses. Experimental results are given for different values of 
A and the stopping criterion is chosen to be | |r* 11/ | | r° 1 | < 10“®. Diagonal scaling 
is always used except for RJC. The six test problems are: 

(1) the 2D model problem with solution u — x(x— l)y(y— 1) e**', 


a«i-/, ft = (o,i) a , 

(2) a 2D variable coefficient problem with solution « — xe^sinirxsinJry, 


_B_ 

Bx 


—zy Bu 

Bx 


By 


Bu 

dy 


f, n-(0,l) 2 , 


(3) a 2D problem with discontinuous coefficients with / ■* 2x(l — x)+2y(l— y ), 

B 


Bx 


i x Bu 


+ 


By 


n = ( o , i ) 2 , 


where 


10 4 x > 0.5 y < 0.5 
P(x,y) = jlO -4 x <0.5 y > 0.5, 

1 elsewhere 

(4) the 3D model problem with solution u — x(l— x)y(l— y)x(l— xje**'*, 

An-/, ft - (0,1) 3 , 

(5) a 3D variable coefficient problem with solution ti — e^sinrrxsinjrysinjrz, 


_B_ 

Bx 


e .„ a« 

+ • 

— 

+ ± 

' ’ 

-IN Bu 

dx 

i 

By 

dy\ 

Bz 

Bz 


(5.1) 


(5.2) 


(5.3) 


(5-4) 


/, ft-(O.l) 3 , (5.5) 

(6) a 3D problem with discontinuous coefficients with / — 2x(l— x)+2y(l— y)+2x(l— z), 


B I / x Bu 

Tz T* 

where 


8 

By 


***••'> It 


3_ 

bz 


/ x Bu 

**■••*' a? 


/, ft - (0,1) 3 , (5.6) 


p{x,y,z) - 


10~ 4 

10 4 

1 


x > 0.5 with y < 0.5, z < 0.5 or y > 0.5, z > 0.5 
x < 0.5 with y > 0.5, z < 0.5 or y < 0.5, z > 0.5. 
elsewhere 
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The number of iterations and operation counts per grid point are plotted in Figures 5.1- 

5.6 (a) and (b) respectively. We can make the following observations from these Figures. 

1. The BPX and MGMF preconditioners have better convergence behavior compared 
to the HB preconditioner, especially for 3D problems. The HB method is competi- 
tive with the other multilevel methods only for the discontinuous coefficient prob- 
lem in 2D. 

2. The 0(log“n) convergence rate for all the multilevel methods is evident, except for 
the 3D HB method. The 3D HB method behaves like Ofh -050 ) and 0(/»'-° 7 °) for 
problems (5.4) and (5.5) which are close to the predicted theoretical result 0 (/j~°- 5 ). 
However, for the discontinuous coefficient problem (5.6), it converges more slowly 
like 0{h~ 1M ). 

3. In general, the MGMF methods perform slightly better than the corresponding BPX 
methods. Recall that the only difference between the two methods is the choice of 
the elementary filters. 

4. Filtering twice (BPX2, BPX3, and MGMF2) does help to improve the convergence 
rates for the model Poisson problem in both 2D and 3D (the MGMF2 and BPX3 
preconditioners appear to be spectrally equivalent.) However, for variable and 
discontinuous coefficient problems, filtering twice does not seem to improve the con- 
vergence rates enough to compensate for the extra work involved. 

5. The MGMF3 method is designed to incorporate the desired features of MGMFl and 
MGMF2, i.e. the good convergence property due to filtering twice and the smaller 
amount of work due to filtering once at the finest grid level. It turns out that it 
works very well. MGMF3 behaves better than MGMFl but worse than MGMF2 in 
number of iterations required. However, in terms of amount of work, MGMF3 is 
better than MGMFl and MGMF2. 
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i 

6. For small n ( approximately < 100 ), the RIC method is actually quite competitive 
with all the multilevel methods. In fact, for the discontinuous coefficient problems, 
none of the multilevel preconditioners gives better convergence rate than the RIC 
preconditioner. It appears that the RIC preconditioner captures the variation of the 
coefficients especially well. Its performance deteriorates as n gets large, as predicted 
by its inferior asymptotic convergence rate. 

7. The MG precon ditioner is among the most efficient methods for problems with 
smooth coefficients. However, it has some difficulties with problems with discon- 
tinuous coefficients. In fact, for Problem (5.3), MG(1,1) requires too many iterations 
to fit on the plot. Instead we show the results for the MG(3,3) method, which con- 
verges in a reasonable number of iterations but still requires the most work of all 
the methods. We have noticed that the performance of the multigrid methods are 
somewhat sensitive to the initial guess. In experiments with random initial guesses, 
we have observed that the performance of the multigrid methods are significantly 
improved. This may be due to the extra smoothing operations in the multigrid 
methods which are more adapt at annihilating the high frequency errors inherent in 
the random initial guess. 
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8. Conclusions 

The experimental results show that the class of multilevel filtering preconditioners 
compares favorably with the hierarchical basis and the RIC preconditioners, at least for 
problems with smooth coefficients and quasi-uniform grids such as used in our experi- 
ments. For these types of problems, the multilevel filtering and the BPX methods 
behave quite similarly to the multigrid preconditioner. What these new methods offer is 
the saving of smoothing operations which are difficult to make effective for irregular 
problems, while preserving the nice asymptotic convergence rates of multigrid precondi- 
tioners. The relative performance of the hierarchical basis method should improve for 
irregular problems on highly non-uniform and refined meshes. Even though the RIC 
preconditioner shows better convergence rate for strongly discontinuous coefficient prob- 
lems, it has a low degree of parallelism. The multilevel filtering preconditioners are very 
similar to the BPX method. What the filtering framework provides is the flexibility of 
filter design which can lead to more efficient methods. 
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Figure Captions 


Figure 2.1: Spectra of A , P~ l and P~ l A . 

Figure 2.2: Blockdiagram of the MF preconditioner with a single discretization 
grid (SGMF). 

Figure 2.3: Spectra of maximally flat lowpass filters Hu with J = 1, 3, 5. 

Figure 2.4: Spectra of H L , I-H L _ j and F L _ t . 

Figure 2.5: Eigenvalues of M~ l A with J — 1, 3, 5. 

Figure 2.6: Condition numbers of the MF-preconditioned Laplacian with (a) 7= 1 
and (b) 7=3. 

Figure 3.1: Blockdiagram of the MGMF preconditioner. 

Figure 3.2: Blockdiagram of the modified MGMF preconditioner. 

Figure 5.1: (a) Iteration and (b) operation counts for Test Problem 1. 

Figure 5.2: (a) Iteration and (b) operation counts for Test Problem 2. 

Figure 5.3: (a) Iteration and (b) operation counts for Test Problem 3. 

Figure 5.4: (a) Iteration and (b) operation counts for Test Problem 4. 

Figure 5.5: (a) Iteration and (b) operation counts for Test Problem 5. 

Figure 5.6: (a) Iteration and (b) operation counts for Test Problem 6. 























- 34- 



Figure 2.3: Spectra of maximally flat lowpass filters Hu with / - 1, 3, 5. 
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Figure 2.4: Spectra of Hi, I— Hi - 1 and 







Figure 2.6: Condition numbers of the MF-preconditioned Laplacian with (a) J — i 
and (b) J=3. 



- 38- 



Figure 3.1: Blockdi&gr&m of the MGMF preconditioner. 
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Figure 5.1: (a) Iteration and (b) operation counts for Test Problem 1. 





operatio n counts per point number of iterations 
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Figure 5.2(a): iteration counts for Test Problem 2 



Figure 5.2(b): operation coons per grid poms for Test Problem 2 



Figure 5.2: (a) Iteration and (b) operation counts for Test Problem 2. 




operetta counts per point number of iterations 
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Hgure 53(a): tena o o coants for Test Problem 3 




Figure 5.3: (a) Iteration and (b) operation counts for Test Problem 3. 



operation counts per point number of iterations 
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Figure 5.4: (a) Iteration and (b) operation counts for Test Problem 4. 




operation cowls per point number of heraUoni 
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Hgne 5_5(»): herra'oc counts for Test Problem 5 



Heme 5.3(b): operation etnas per grid point for' Test Problem J 



Figure 5.5: (a) Iteration and (b) operation counts for Test Problem 5. 
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Figure 5.6(a): iteration co un t s far Test Problem 6 




Ftgnre 5.6: (a) Iteration and (b) operation counts for Test Problem 6. 







